22 research outputs found
Revisiting Data Complexity Metrics Based on Morphology for Overlap and Imbalance: Snapshot, New Overlap Number of Balls Metrics and Singular Problems Prospect
Data Science and Machine Learning have become fundamental assets for
companies and research institutions alike. As one of their fields, supervised
classification allows for class prediction of new samples, learning from given
training data. However, some properties can cause datasets to be problematic to
classify.
In order to evaluate a dataset a priori, data complexity metrics have been
used extensively. They provide information regarding different intrinsic
characteristics of the data, which serve to evaluate classifier compatibility
and a course of action that improves performance. However, most complexity
metrics focus on just one characteristic of the data, which can be insufficient
to properly evaluate the dataset towards the classifiers' performance. In fact,
class overlap, a very detrimental feature for the classification process
(especially when imbalance among class labels is also present) is hard to
assess.
This research work focuses on revisiting complexity metrics based on data
morphology. In accordance with their nature, the premise is that they provide
both good estimates for class overlap, and great correlations with the
classification performance. For that purpose, a novel family of metrics has
been developed. Being based on ball coverage by classes, they are named
Overlap Number of Balls. Finally, some prospects for the adaptation of the
former family of metrics to singular (more complex) problems are discussed.
Comment: 23 pages, 9 figures, preprin
mldr.resampling: Efficient Reference Implementations of Multilabel Resampling Algorithms
Resampling algorithms are a useful approach to deal with imbalanced learning
in multilabel scenarios. These methods have to deal with singularities in the
multilabel data, such as the occurrence of frequent and infrequent labels in
the same instance. Implementations of these methods are sometimes limited to
the pseudocode provided by their authors in a paper. This Original Software
Publication presents mldr.resampling, a software package that provides
reference implementations for eleven multilabel resampling methods, with an
emphasis on efficiency since these algorithms are usually time-consuming.
New High-Performance Processing Hardware Architectures for Deep Learning
The design and manufacture of hardware is expensive, both in time and in economic investment, which is why integrated circuits are always manufactured in large volume, to take advantage of economies of scale. For this reason, the majority of processors manufactured are general purpose, which broadens their range of applications. In recent years, however, more and more processors are being manufactured for specific applications, including those aimed at accelerating work with deep neural networks. This article introduces the need for this type of specialized hardware, describing its purpose, operation and current implementations.
Universidad de Granada: Departamento de Arquitectura y Tecnología de Computadore
COVIDGR Dataset and COVID-SDNet Methodology for Predicting COVID-19 Based on Chest X-Ray Images
Currently, Coronavirus disease (COVID-19), one of the most infectious diseases in the 21st century, is diagnosed using RT-PCR testing, CT scans and/or Chest X-Ray (CXR) images. CT (Computed Tomography) scanners and RT-PCR testing are not available in most medical centers and hence in many cases CXR images become the most time/cost effective tool for assisting clinicians in making decisions. Deep learning neural networks have a great potential for building COVID-19 triage systems and detecting COVID-19 patients, especially patients with low severity. Unfortunately, current databases do not allow building such systems as they are highly heterogeneous and biased towards severe cases. The contribution of this article is three-fold: (i) we demystify the high sensitivities achieved by most recent COVID-19 classification models, (ii) under a close collaboration with Hospital Universitario Clínico San Cecilio, Granada, Spain, we built COVIDGR-1.0, a homogeneous and balanced database that includes all levels of severity, from normal with Positive RT-PCR, Mild, Moderate to Severe. COVIDGR-1.0 contains 426 positive and 426 negative PA (PosteroAnterior) CXR views and (iii) we propose COVID Smart Data based Network (COVID-SDNet) methodology for improving the generalization capacity of COVID-classification models. Our approach reaches good and stable results with an accuracy of 97.72% ± 0.95%, 86.90% ± 3.20% and 61.80% ± 5.49% in severe, moderate and mild COVID-19 severity levels, respectively. Our approach could help in the early detection of COVID-19. COVIDGR-1.0 along with the severity level labels are available to the scientific community through this link https://dasci.es/es/transferencia/open-data/covidgr/
This work was supported by the project DeepSCOP-Ayudas Fundación BBVA a Equipos de Investigación Científica en Big Data 2018, COVID19_RX-Ayudas Fundación BBVA a Equipos de Investigación Científica SARS-CoV-2 y COVID-19 2020, and the Spanish Ministry of Science and Technology under the project TIN2017-89517-P. S. Tabik was supported by the Ramon y Cajal Programme (RYC-2015-18136). A. Gómez-Ríos was supported by the FPU Programme FPU16/04765. D. Charte was supported by the FPU Programme FPU17/04069. J. Suárez was supported by the FPU Programme FPU18/05989. E.G was supported by the European Research Council (ERC Grant agreement 647038 [BIODESERT])
Artificial intelligence within the interplay between natural and artificial computation: Advances in data science, trends and applications
Artificial intelligence and all its supporting tools, e.g. machine and deep learning in computational intelligence-based systems, are rebuilding our society (economy, education, life-style, etc.) and promising a new era for the social welfare state. In this paper we summarize recent advances in data science and artificial intelligence within the interplay between natural and artificial computation. A review of recent works published in the latter field and the state of the art is presented in a comprehensive and self-contained way to provide a baseline framework for the international community in artificial intelligence. Moreover, this paper aims to provide a complete analysis and some relevant discussions of the current trends and insights within several theoretical and application fields covered in the essay, from theoretical models in artificial intelligence and machine learning to the most prospective applications in robotics, neuroscience, brain computer interfaces, medicine and society, in general.
BMS - Pfizer (U01 AG024904). Spanish Ministry of Science, projects: TIN2017-85827-P, RTI2018-098913-B-I00, PSI2015-65848-R, PGC2018-098813-B-C31, PGC2018-098813-B-C32, RTI2018-101114-B-I, TIN2017-90135-R, RTI2018-098743-B-I00 and RTI2018-094645-B-I00; the FPU program (FPU15/06512, FPU17/04154) and Juan de la Cierva (FJCI-2017–33022). Autonomous Government of Andalusia (Spain) projects: UMA18-FEDERJA-084. Consellería de Cultura, Educación e Ordenación Universitaria of Galicia: ED431C2017/12, accreditation 2016–2019, ED431G/08, ED431C2018/29, Comunidad de Madrid, Y2018/EMT-5062 and grant ED431F2018/02.
PPMI – a public – private partnership – is funded by The Michael J. Fox Foundation for Parkinson’s Research and funding partners, including Abbott, Biogen Idec, F. Hoffman-La Roche Ltd., GE Healthcare, Genentech and Pfizer Inc